Online adaptation of HMMs to real-life conditions: a unified framework
Abstract
This paper introduces a unified framework for online adaptation of hidden Markov model (HMM) parameters to real-life conditions, with the aim of improving the robustness of speech recognition systems. It also describes techniques developed to control the convergence of adaptation in unsupervised modes. Classically, two approaches have been used to adapt HMM parameters to new conditions: Bayesian adaptation and spectral transformation, the latter generally using linear regression. This paper lays out a unifying framework in which both Bayesian adaptation and spectral transformation adaptation are seen as particular cases. The framework assigns one transformation to each Gaussian distribution and automatically partitions the Gaussians into classes according to the adaptation data. The transformations within a class share the same parameter vector, so the number of degrees of freedom of the global transformation is driven by the data. The parameters of the global transformation are determined according to the maximum a posteriori (MAP) criterion, using the original HMM a priori distributions. The general adaptation algorithm has been implemented within the CNET speech recognition system, and the whole system has been evaluated on several field-telephone databases. In online unsupervised mode, the new adaptation method makes the speech recognition system converge systematically toward a system enrolled with field data in supervised mode.
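The idea that Bayesian adaptation and a shared spectral transformation are two ends of one scheme can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the simplest possible transformation (an additive bias on each Gaussian mean), unit variances, a zero-mean Gaussian prior on the bias whose weight `tau` is a hypothetical parameter, and a class assignment `class_of` supplied by hand rather than found automatically from the data. The function name `adapt_means` is likewise an illustration only.

```python
from typing import List

Vector = List[float]


def adapt_means(
    means: List[Vector],             # G Gaussian means, each of dimension D
    class_of: List[int],             # class index of each Gaussian (assumed given)
    frames: List[Vector],            # T adaptation frames, each of dimension D
    posteriors: List[List[float]],   # T x G occupation probabilities
    tau: float,                      # prior weight, must be > 0 (hypothetical)
) -> List[Vector]:
    """MAP estimate of one additive bias per class, applied to every mean.

    Assumes unit variances and a zero-mean Gaussian prior on each bias, so
    bias_c = sum(gamma * (x - mu_g)) / (tau + sum(gamma)) over Gaussians g in c.
    """
    dim = len(means[0])
    n_classes = max(class_of) + 1
    residual = [[0.0] * dim for _ in range(n_classes)]
    occupancy = [0.0] * n_classes

    # Accumulate posterior-weighted residuals per class.
    for x, gammas in zip(frames, posteriors):
        for g, gamma in enumerate(gammas):
            c = class_of[g]
            occupancy[c] += gamma
            for d in range(dim):
                residual[c][d] += gamma * (x[d] - means[g][d])

    # The prior weight tau shrinks each bias toward zero when data are scarce.
    biases = [
        [r / (tau + occupancy[c]) for r in residual[c]]
        for c in range(n_classes)
    ]

    # Every Gaussian is shifted by the bias of its class.
    return [
        [m + b for m, b in zip(means[g], biases[class_of[g]])]
        for g in range(len(means))
    ]


means = [[0.0], [10.0]]
frames = [[2.0], [10.0]]
posteriors = [[1.0, 0.0], [0.0, 1.0]]

# One class per Gaussian: each mean moves on its own evidence.
per_gaussian = adapt_means(means, [0, 1], frames, posteriors, tau=1.0)

# One global class: both means move by the same shared bias.
shared = adapt_means(means, [0, 0], frames, posteriors, tau=1.0)
```

With one class per Gaussian, the update reduces to a per-mean Bayesian update. With a single class, it behaves like one global shift applied to all means. Intermediate partitions fall between the two, which is the sense in which the number of classes sets the degrees of freedom.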
Similar resources
Incremental on-line speaker adaptation in adverse conditions
In this paper, we examine the use of speaker adaptation in adverse noise conditions. In particular, we focus on incremental on-line speaker adaptation since it, in addition to its other advantages, enables joint speaker and environment adaptation. First, we show that on-line adaptation is superior to off-line adaptation when realistic changing noise conditions are considered. Next, we show that...
Maximum-likelihood adaptation of semi-continuous HMMs by latent variable decomposition of state distributions
Compared to fully-continuous HMMs, semi-continuous HMMs are more compact in size, require less data to train well and result in comparable recognition performance with much faster decoding speeds. Nevertheless, the use of semi-continuous HMMs in large vocabulary speech recognition systems has declined considerably in recent years. A significant factor that has contributed to this is that systems t...
Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
This paper describes a technique for synthesizing speech with arbitrary speaker characteristics using speaker-independent speech units, which we call “average voice” units. The technique is based on an HMM-based text-to-speech (TTS) system and the MLLR adaptation algorithm. In the HMM-based TTS system, speech synthesis units are modeled by multi-space probability distribution (MSD) HMMs which ca...
The Role of Computer Anxiety in Acceptance of Iranian Public Library Management System Based on the Unified Theory of Acceptance and Use of Technology
Purpose: The main purpose of this study was to measure the acceptance and role of computer anxiety among the users of public libraries in Kerman province while using Iranian Public Library Management System (SAMAN) within the framework of the unified theory of acceptance and use of technology (UTAUT). Method: This is an applied study in terms of purpose and a descriptive study conducted using ...
Real-Time Building Information Modeling (BIM) Synchronization Using Radio Frequency Identification Technology and Cloud Computing System
The online observation of a construction site and its processes offers a significant advantage to all business sectors. BIM is the combination of a 3D model of the project and a project-planning program, which extends the project planning model up to 6D (adding Time, Cost and Material Information dimensions to the model). RFID technology is an appropriate information synchronization tool between the...
Journal title:
- IEEE Trans. Speech and Audio Processing
Volume 9, Issue
Pages -
Publication date 2001